Using DCOPs to Balance Exploration and Exploitation in Time-Critical Domains

Authors

Matthew E. Taylor

Manish Jain

Prateek Tandon

Milind Tambe

Abstract

Substantial work has investigated balancing exploration and exploitation, but relatively little has addressed this tradeoff in the context of coordinated multi-agent interactions. This paper introduces a class of problems in which agents must maximize their on-line reward, a decomposable function dependent on pairs of agents' decisions. Unlike previous work, agents must both learn the reward function and exploit it on-line, critical properties for a class of physically motivated systems, such as mobile wireless networks. This paper introduces algorithms motivated by the Distributed Constraint Optimization Problem framework and demonstrates when, and at what cost, increasing agents' coordination can improve the global reward on such problems.
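
The "decomposable function dependent on pairs of agents' decisions" can be read, in the usual DCOP style, as a sum of pairwise terms. The following is only a sketch of that reading; the symbols $x_i$, $E$, and $r_{ij}$ are assumed notation and do not appear in the abstract.

```latex
% Assumed notation (not taken from the abstract):
%   x_i    : the decision of agent i
%   E      : the set of agent pairs whose decisions interact
%   r_{ij} : the pairwise reward for agents i and j, unknown to the agents at the start
R(x_1, \dots, x_n) = \sum_{(i,j) \in E} r_{ij}(x_i, x_j)
```

Under this reading, the agents have to estimate each $r_{ij}$ from the rewards they observe while still choosing decisions that keep the accumulated value of $R$ high.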

Similar resources

DCOPs and bandits: exploration and exploitation in decentralised coordination

Real life coordination problems are characterised by stochasticity and a lack of a priori knowledge about the interactions between agents. However, decentralised constraint optimisation problems (DCOPs), a widely adopted framework for modelling decentralised coordination tasks, assumes perfect knowledge of these factors, thus limiting its practical applicability. To address this shortcoming, we...

Balance Within and Across Domains: The Performance Implications of Exploration and Exploitation in Alliances

Organizational research advocates that firms balance exploration and exploitation, yet it acknowledges inherent challenges in reconciling these opposing activities. To overcome these challenges, such research suggests that firms establish organizational separation between exploring and exploiting units or engage in temporal separation whereby they oscillate between exploration and exploitation ...

Balancing Exploration and Exploitation in Alliance Formation

Do firms balance exploration and exploitation in their alliance formation decisions and, if so, why and how? We argue that absorptive capacity and organizational inertia impose conflicting pressures for exploration and exploitation with respect to the value chain function of alliances, the attributes of partners, and partners’ network positions. Although path dependencies reinforce either explo...

Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in oth...

An Improved Bat Algorithm with Grey Wolf Optimizer for Solving Continuous Optimization Problems

Metaheuristic algorithms are used to solve NP-hard optimization problems. These algorithms have two main components, i.e. exploration and exploitation, and try to strike a balance between exploration and exploitation to achieve the best possible near-optimal solution. The bat algorithm is one of the metaheuristic algorithms with poor exploration and exploitation. In this paper, exploration and ...

Publication date: 2009
